Event detection in consumer videos using GMM supervectors and SVMs

Authors

  • Yusuke Kamishima
  • Nakamasa Inoue
  • Koichi Shinoda
Abstract

In large-scale multimedia event detection, complex target events are extracted from a large set of consumer-generated web videos taken in unconstrained environments. We devised a multimedia event detection method based on Gaussian mixture model (GMM) supervectors and support vector machines. A GMM supervector consists of the parameters of a GMM fitted to the distribution of low-level features extracted from a video clip. A GMM can be regarded as a probabilistic extension of the bag-of-words framework, and it can therefore be expected to be robust against the data insufficiency problem. We also propose a camera-motion-cancelled feature, a spatio-temporal feature that is robust against the camera motion found in consumer-generated web videos. By combining these methods with existing features, we aim to construct a high-performance event detection system. The effectiveness of our method is evaluated using the TRECVID MED task benchmark.
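To make the supervector idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a diagonal-covariance background GMM that has already been trained, adapts only the means to one clip by MAP estimation, and stacks the scaled means into a single vector. The function names, the relevance factor `tau`, and the scaling by the square root of the weight over the standard deviation are assumptions borrowed from common GMM-supervector practice; the abstract does not specify them.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Responsibility of each mixture component for one feature vector."""
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def gmm_supervector(features, weights, means, variances, tau=20.0):
    """MAP-adapt the background means to one clip and stack them.

    tau is a hypothetical relevance factor: larger values keep the
    adapted means closer to the background model.
    """
    num_components, dim = len(means), len(means[0])
    counts = [0.0] * num_components
    sums = [[0.0] * dim for _ in range(num_components)]
    for x in features:
        for k, c in enumerate(posteriors(x, weights, means, variances)):
            counts[k] += c
            for d in range(dim):
                sums[k][d] += c * x[d]
    supervector = []
    for k in range(num_components):
        for d in range(dim):
            adapted = (tau * means[k][d] + sums[k][d]) / (tau + counts[k])
            # Assumed scaling: sqrt(weight) / standard deviation.
            scale = math.sqrt(weights[k]) / math.sqrt(variances[k][d])
            supervector.append(scale * adapted)
    return supervector
```

Each clip then yields one fixed-length vector of size (number of components) × (feature dimension), regardless of how many low-level features the clip contains, which is what allows a standard SVM to be trained on the clips.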


Similar resources

Writer Identification Using GMM Supervectors and Exemplar-SVMs

This paper describes a method for robust offline writer identification. We propose to use RootSIFT descriptors computed densely at the script contours. GMM supervectors are used as encoding method to describe the characteristic handwriting of an individual scribe. GMM supervectors are created by adapting a background model to the distribution of local feature descriptors. Finally, we propose to...


Compact Audio Representation for Event Detection in Consumer Media

Local audio-visual descriptors are often compactly stored using representations such as the soft quantization histogram [1]. Typically, classification performance with histogram representations is improved through the use of large codeword sets. Unfortunately, this approach runs into overfitting and scalability challenges when applied to richly diverse real-world collections. A novel “i-vector”...


Language recognition using language factors

Language recognition systems based on acoustic models reach state-of-the-art performance using discriminative training techniques. In speaker recognition, eigenvoice modeling of the speaker and the use of speaker factors as input features to SVMs have recently been demonstrated to give good results compared to the standard GMM-SVM approach, which combines GMM supervectors and SVMs. In this pap...


Applying SVMs and weight-based factor analysis to unsupervised adaptation for speaker verification

This paper presents an extended study on the implementation of support vector machine (SVM) based speaker verification in systems that employ continuous progressive model adaptation using the weight-based factor analysis model. The weight-based factor analysis model compensates for session variations in unsupervised scenarios by incorporating trial confidence measures in the general statistics ...


Intoxicated Speech Detection by Fusion of Speaker Normalized Hierarchical Features and GMM Supervectors

Speaker state recognition is a challenging problem due to speaker and context variability. Intoxication detection is an important area of paralinguistic speech research with potential real-world applications. In this work, we build upon a base set of various static acoustic features by proposing the combination of several different methods for this learning task. The methods include extracting ...



Journal:
  • EURASIP J. Image and Video Processing

Volume 2013  Issue

Pages  -

Publication date 2013